Recognizing Cursive Typewritten Text Using Segmentation-Free System

نویسنده

  • Mohammad S. Khorsheed
چکیده

Feature extraction plays an important role in text recognition as it aims to capture essential characteristics of the text image. Feature extraction algorithms widely range between robust and hard to extract features and noise sensitive and easy to extract features. Among those feature types are statistical features which are derived from the statistical distribution of the image pixels. This paper presents a novel method for feature extraction where simple statistical features are extracted from a one-pixel wide window that slides across the text line. The feature set is clustered in the feature space using vector quantization. The feature vector sequence is then injected to a classification engine for training and recognition purposes. The recognition system is applied to a data corpus which includes cursive Arabic text of more than 600 A4-size sheets typewritten in multiple computer-generated fonts. The system performance is compared to a previously published system from the literature with a similar engine but a different feature set.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Region growing based segmentation algorithm for typewritten and handwritten text recognition

This paper presents a new technique of high accuracy to recognize both typewritten and handwritten English and Arabic texts without thinning. After segmenting the text into lines (horizontal segmentation) and the lines into words, it separates the word into its letters. Separating a text line (row) into words and a word into letters is performed by using the region growing technique (implicit s...

متن کامل

A Markovian Engine for Text Recognition: - Cursive Arabic Text, Statistical Features and Interconnected HMMs

This paper presents a cursive Arabic text recognition system. The system decomposes the document image into text line images and extracts a set of simple statistical features from a one-pixel width window which is sliding a cross that text line. It then injects the resulting feature vectors to Hidden Markov Models. The proposed system is applied to a data corpus which includes Arabic text of mo...

متن کامل

A Simple and Effective Cursive Word Segmentation Method

Segmentation is the operation that seeks to decompose a word image in a sequence of subimages containing isolated characters. Segmentation is a critical phase of the single word recognition process, and this is witnessed by the higher performance for the recognition of isolated characters vs. that obtained for cursive words. There are two main strategies for segmentation [1]. Straight segmentat...

متن کامل

Microsoft Word - CONTENTS-AUGUST07

The last two decades witnessed some advances in the development of an Arabic character recognition (CR) system. Arabic CR faces technical problems not encountered in any other language that make Arabic CR systems achieve relatively low accuracy and retards establishing them as market products. We propose the basic stages towards a system that attacks the problem of recognizing online Arabic cur...

متن کامل

Simultaneous Segmentation and Recognition of Arabic Characters in an Unconstrained On-Line Cursive Handwritten Document

The last two decades witnessed some advances in the development of an Arabic character recognition (CR) system. Arabic CR faces technical problems not encountered in any other language that make Arabic CR systems achieve relatively low accuracy and retards establishing them as market products. We propose the basic stages towards a system that attacks the problem of recognizing online Arabic cur...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره 2015  شماره 

صفحات  -

تاریخ انتشار 2015